What Makes a Hollywood Movie a Hit or a Flop?

Final Project
Data Science 1 with R (STAT 301-1)

Author

Celena Kim

Published

December 3, 2023

Introduction

As an avid movie lover, I have always been curious about why some Hollywood movies become critically acclaimed blockbusters while others fade into the background. Beyond opening weekend numbers alone, I am interested in the interplay between the broader set of variables that contribute to a film ultimately being a hit or a flop. Specifically, I focus on five factors: critic and audience ratings, opening weekend revenue, gross (domestic, foreign, and worldwide), budget and budget recovery, and Oscar wins. I am also curious whether the time of year has an impact on a movie’s success, and whether there is a particular season in which the most successful movies are released. By focusing on these variables, I hope to address my research question by uncovering patterns and correlations at levels ranging from univariate to multivariate. I am interested in whether certain variables affect one another and how they work together to contribute to a movie’s overall success. To carry out this analysis, I use a data set from Kaggle called “Hollywood Hits and Flops (2007 - 2023)”, described in the next section.

Data Overview and Quality

text

Many variables were not stored in a form conducive to analysis. For example, several were character-type variables, and the Oscar variable was not stored as a boolean.
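The kind of cleaning described above might look like the following base R sketch. The column names (`oscar_winners`, `budget`) and the toy values are assumptions made up for illustration; they are not taken from the actual Kaggle file.

```r
# Toy data; column names and values are hypothetical, not from the real data set.
movies <- data.frame(
  film          = c("A", "B", "C"),
  oscar_winners = c("Oscar winner", "", NA),
  budget        = c("150", "30", "12.5"),
  stringsAsFactors = FALSE
)

# Recode the character Oscar column as a logical: TRUE only when a win is recorded.
movies$won_oscar <- !is.na(movies$oscar_winners) & movies$oscar_winners != ""

# Convert a numeric column that was read in as character.
movies$budget <- as.numeric(movies$budget)

str(movies)
```

Treating blank and missing entries as "no win" is itself an assumption; if missing values could mean "unknown", they would be better kept as NA.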

Explorations: What variables contribute to a movie’s overall success rate?

Variable 1: Ratings

Within this data set, the three main movie rating measures are the Rotten Tomatoes score (audience and critic), the Metacritic score (audience and critic), and the IMDb rating.

Figure 1: A visualization of the change in audience and critic ratings of Hollywood movies from 2007-2022.

Figure 1 visualizes how the average of the Rotten Tomatoes and Metacritic scores has changed over the years, separated by audience and critic rating groups. Overall, it appears that these ratings have increased since 2007. Additionally, the audience rating group seems to consistently give higher ratings than the critic rating group. This analysis helps us understand how much the ratings of the two groups differ, and it visualizes the overall pattern of ratings over the years.
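A yearly average like the one plotted in Figure 1 could be computed along these lines. This is a minimal sketch: the column names and values are hypothetical, and it assumes the Rotten Tomatoes and Metacritic scores have already been put on the same 0-100 scale.

```r
# Toy data; column names and values are hypothetical.
movies <- data.frame(
  year        = c(2007, 2007, 2008, 2008),
  rt_critic   = c(60, 80, 50, 70),
  mc_critic   = c(55, 75, 45, 65),
  rt_audience = c(70, 90, 60, 80),
  mc_audience = c(65, 85, 55, 75)
)

# Per-movie average of the two sites, separately for each rating group.
movies$critic_avg   <- rowMeans(movies[, c("rt_critic", "mc_critic")], na.rm = TRUE)
movies$audience_avg <- rowMeans(movies[, c("rt_audience", "mc_audience")], na.rm = TRUE)

# Yearly mean of each group's average rating (one row per year).
yearly <- aggregate(cbind(critic_avg, audience_avg) ~ year, data = movies, FUN = mean)
print(yearly)
```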

Figure 2: The average of Rotten Tomatoes critic ratings and Metacritic critic ratings for movies that have won at least one Oscar and movies that have not won any Oscars.

Figure 2 makes use of two measures of movie success that are determined solely by movie critics: Oscar wins and average critic ratings. As can be seen in the table, Hollywood movies that have won at least one Oscar have a higher average of Rotten Tomatoes and Metacritic critic ratings than those that have not won any Oscars. This suggests that critical assessment and award recognition follow a similar pattern, in that movies praised enough to win an Oscar are also rated highly by Rotten Tomatoes and Metacritic critics.
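The comparison in Figure 2 amounts to a grouped mean. A minimal sketch in base R, using hypothetical column names (`won_oscar`, `critic_avg`) and made-up values:

```r
# Toy data; column names and values are hypothetical.
movies <- data.frame(
  won_oscar  = c(TRUE, TRUE, FALSE, FALSE, FALSE),
  critic_avg = c(90, 80, 60, 50, 40)
)

# Mean critic rating within each Oscar group; result is named "FALSE" and "TRUE".
by_oscar <- tapply(movies$critic_avg, movies$won_oscar, mean, na.rm = TRUE)
print(by_oscar)
```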

Figure 3: There is a slight positive association between Rotten Tomatoes critic rating and opening weekend revenue, but it is not very strong.

A movie’s Rotten Tomatoes critic rating is typically released before the movie hits theaters. Thus, I was interested in exploring the extent to which this rating influences the movie’s opening weekend revenue. Figure 3 shows, however, that the correlation between these two variables is not very strong. There is a very slight positive association, suggesting that, to some extent, movies with higher Rotten Tomatoes critic ratings also have higher opening weekend earnings. Because this association is very weak, the Rotten Tomatoes critic score does not appear to have a large or direct impact on opening weekend revenue.
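The strength of an association like this one can be summarized with a Pearson correlation coefficient. The sketch below uses hypothetical column names and arbitrary toy values, so the number it produces says nothing about the real data.

```r
# Toy data; column names and values are hypothetical.
movies <- data.frame(
  rt_critic       = c(20, 40, 60, 80, NA),
  opening_weekend = c(10, 30, 20, 40, 25)
)

# Pearson correlation, dropping rows where either value is missing.
r <- cor(movies$rt_critic, movies$opening_weekend, use = "complete.obs")
print(r)
```

A value of `r` near 0 indicates a weak linear association, and values near 1 or -1 indicate a strong one.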

Figure 4: The average IMDb, Metacritic, and Rotten Tomatoes critic ratings for each unique script type combination of Hollywood movies.

Figure 4 visualizes the average IMDb, Metacritic, and Rotten Tomatoes critic ratings for each of the unique script type combinations of Hollywood movies from 2007-2022. One key point to note is that the average IMDb rating is only available for 5 of the 16 script types, revealing a great amount of missingness within this variable and making it difficult to reach a conclusion about the relationship between script type and average IMDb rating. For the other two rating variables, the script type with the highest Metacritic and Rotten Tomatoes critic ratings is “documentary”, suggesting that this script type is more favored by critics than other script types.

Variable 2: Opening Weekend Revenue

A movie’s opening weekend revenue refers to the total box office earnings that the film earned during its first weekend of release in theaters.

Figure 5: A visualization of the change in yearly average opening weekend revenue for Hollywood movies from 2007-2022.

Figure 5 visualizes the change in mean opening weekend earnings (in millions) for Hollywood movies from 2007-2022. The graph has two distinct low points, in 2008 and 2020, and these drops can be explained by the economic state of the country during those years. In 2008, the country experienced the Great Recession, an economic downturn that greatly impacted the film industry. This crisis led to a dramatic decline in consumer spending and movie production, possibly leading to the drop in mean opening weekend earnings that we see for this year. In 2020, we see a far more drastic drop in mean opening weekend revenue, as the COVID-19 pandemic led to nationwide shutdowns of and capacity limits in movie theaters. Under these conditions, ticket sales declined dramatically, and so did the mean opening weekend revenue of movies released during the pandemic, as shown in the graph. These findings are worth keeping in mind throughout this variable analysis, as opening weekend revenue is highly sensitive to economic crises such as the 2008 Great Recession and the 2020 COVID-19 pandemic.

Figure 6: The correlation between a Hollywood movie’s production budget and how much it earned during its opening weekend.

In Figure 6, there is a clear positive association between a Hollywood movie’s budget and its opening weekend earnings. This suggests that, on average, movies with higher production budgets tend to achieve greater financial success during their initial release weekends. From this, it can be concluded that movie budget and opening weekend revenue are related: as budgets increase, opening weekend earnings tend to increase as well.

Figure 7: The genres and script types of the top 5 movies that earned the most revenue during their opening weekends.

Figure 7 shows that the genre combination with the greatest average opening weekend revenue is sci-fi & fantasy, and the script type combination with the greatest average opening weekend revenue is sequel & adaptation. This suggests that movies combining the sci-fi and fantasy genres earned more during their first weekend of release than other genre combinations, and that movies that are both sequels and adaptations did the same among script type combinations.

Figure 8: The distribution of opening weekend revenue for movies that have won at least one Oscar and movies that have won no Oscars.

Figure 8 shows that Hollywood movies that have won at least one Oscar actually have a lower average opening weekend revenue than movies that have not won any Oscars. This suggests that a strong opening weekend does not go hand in hand with winning an Oscar. In other words, having a high opening weekend revenue may not increase a movie’s chance of winning an Oscar.

Figure 9: The relationship between a Hollywood movie’s opening weekend revenue and its domestic and foreign gross.

Figure 9 displays very strong, positive correlations between opening weekend revenue and both domestic gross and foreign gross. This suggests that a Hollywood movie’s performance during its opening weekend has a direct positive association with its overall domestic and foreign grosses. That is, movies with higher opening weekend earnings tend to have higher domestic and foreign grosses. Additionally, the trend line for domestic gross seems to be slightly steeper than the trend line for foreign gross, suggesting that opening weekend performance is slightly more strongly tied to a movie’s domestic gross than to its foreign gross.
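One way to compare how steeply each gross rises with opening weekend revenue is to fit a simple linear model to each and compare the slopes. The sketch below uses hypothetical column names and made-up values in arbitrary units; it is not the method confirmed by the source.

```r
# Toy data; column names and values are hypothetical.
movies <- data.frame(
  opening_weekend = c(10, 20, 30, 40),
  domestic_gross  = c(30, 60, 90, 120),
  foreign_gross   = c(25, 50, 75, 100)
)

# One simple linear regression per gross measure.
fit_dom <- lm(domestic_gross ~ opening_weekend, data = movies)
fit_for <- lm(foreign_gross ~ opening_weekend, data = movies)

# Slope = expected change in gross per one-unit increase in opening weekend revenue.
slopes <- c(
  domestic = unname(coef(fit_dom)["opening_weekend"]),
  foreign  = unname(coef(fit_for)["opening_weekend"])
)
print(slopes)
```

Note that slope and correlation answer different questions: the slope measures how steep the trend is, while the correlation measures how tightly the points cluster around it.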

Variable 3: Domestic, Foreign, & Worldwide Gross

Figure 10: A visualization of the change in yearly domestic gross for Hollywood movies from 2007-2022.

Figure 10 visualizes the change in the yearly average domestic gross (in millions) for Hollywood movies from 2007-2022. Just as in Figure 5, there are significant drops for the years 2008 and 2020, also due to the economy of the country during those years. With the 2008 Great Recession, declines in consumer spending due to the economic downturn directly impacted the total box office revenue of movies. With the 2020 COVID-19 pandemic, quarantining and the closing of movie theaters also led to declines in consumer spending and a direct decline in gross domestic revenue for movies. Like the opening weekend revenue variable, the domestic gross variable is heavily impacted by economic crises such as the 2008 Great Recession and the 2020 COVID-19 pandemic.

Figure 11: A visualization of the relationship between a Hollywood movie’s domestic and foreign gross.

Figure 11 displays a direct and strong positive correlation between the domestic gross earnings and foreign gross earnings of Hollywood movies. In other words, as the domestic gross earnings of a movie increase, its foreign gross earnings also increase. This suggests that US and foreign audiences have similar preferences in movies.

Figure 12: The distributions of domestic and foreign gross for each genre of Hollywood movies.

Figure 12 explores another comparison of movie preferences between domestic and foreign audiences, this time by comparing gross performance among movie genres. Ranking genres by average gross revenue for each audience, the “sci-fi” category has the best domestic performance, while the “action” and “adventure” categories are tied for the best foreign performance. This suggests a difference in genre popularity between the two audiences: US audiences have a strong preference for sci-fi movies, while foreign audiences have a strong preference for action and adventure movies. A sci-fi movie may perform better in US theaters than in foreign theaters, and action and adventure movies may perform better in foreign theaters.

Figure 13: A comparison of the top 5 movie distributors with the highest gross earnings between domestic and foreign gross.

As a final comparison of movie preference behavior between domestic and foreign audiences, Figure 13 explores the movie distributors with the top 5 highest average domestic and foreign gross revenues. For both US and foreign audiences, the movie distributor with the most successful gross performance is Walt Disney Studios. This reveals a similarity between domestic and foreign audiences in that movies distributed by Walt Disney Studios are more popular (generate more gross revenue) than movies released by other distributors.

Figure 14: A visualization of the relationship between a Hollywood movie’s worldwide gross revenue and the percent of its production budget that was recovered.

In Figure 14, there is a clear positive relationship between a movie’s worldwide gross earnings and the percent of its budget that is recovered. This suggests that the more box office revenue a movie earns, the more of its budget it earns back after its release into theaters.
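If budget recovery is defined as worldwide gross as a percentage of budget (an assumption; the data set may define it differently), it can be computed as in the sketch below. Column names and values are hypothetical.

```r
# Toy data; column names and values are hypothetical (e.g. millions of dollars).
movies <- data.frame(
  budget          = c(100, 50, 20),
  worldwide_gross = c(300, 25, 20)
)

# Percent of budget recovered, ASSUMING recovery = worldwide gross / budget.
movies$budget_recovered_pct <- 100 * movies$worldwide_gross / movies$budget
print(movies)
```

Under this definition, worldwide gross appears in the numerator, so some positive relationship between the two variables is built in by construction.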

Variable 4: Budget & Budget Recovery

Figure 15: A visualization of the change in average production budgets for Hollywood movies from 2007-2022.

Figure 15 follows the same patterns as Figure 5 and Figure 10, showing that the variable of movie budget is also highly impacted by economic crises. In this graph, there are also two distinct low points corresponding to the years 2008 and 2020. With the 2008 Great Recession, financial challenges could have resulted in cost-cutting measures and a more stringent approach to budgeting for movie distributors, leading to a lower average movie budget for that year. With the 2020 COVID-19 pandemic and quarantine, film studios may have altered their production strategies of their movies by delaying the start of filmmaking, leading to an overall decline in film production and thus a decline in mean budgets for that year. From these three similar variable findings, there seems to be a common trend that a movie’s success is greatly impacted by the economy.

Variable 5: Oscar Wins

Variable 6: Seasonal Release Date

These analyses seek to explore how the five main variables above vary/are impacted by the season a movie is released in, and what seasonal release date trends may exist in influencing a movie’s success rate.
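The source does not say how each movie’s season was determined. One common approach, sketched here as an assumption, is to map the release month to a meteorological season (December-February as Winter, March-May as Spring, and so on). The dates below are arbitrary examples.

```r
# Arbitrary example release dates.
release_date <- as.Date(c("2019-04-26", "2015-12-18", "2018-07-13", "2021-10-22"))

# Month of release as an integer from 1 to 12.
month_num <- as.integer(format(release_date, "%m"))

# Lookup table indexed by month number (ASSUMED meteorological seasons).
season_lookup <- c(
  "Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
  "Summer", "Summer", "Fall", "Fall", "Fall", "Winter"
)

season <- factor(season_lookup[month_num],
                 levels = c("Winter", "Spring", "Summer", "Fall"))
print(season)
```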

Figure 16: A comparison of the mean ratings of movies based on the season they were released in, between the critic and audience rating groups.

Figure 16 compares the average ratings for each season between the critic and audience rating groups. Both groups show a similar seasonal pattern, with the highest average ratings given to movies released in the Fall and the lowest to movies released in the Winter. However, the taller bars in the plot on the right show that the audience rating group gives higher ratings than the critic rating group, as seen in Figure 1. Figure 16 thus shows one way in which the two groups’ rating patterns are similar and confirms a previously found way in which they differ. Overall, movies released in the Fall have the highest ratings, while movies released in the Winter have the lowest.

Figure 17: A comparison of the average opening weekend revenues in millions of dollars for movies based on what season they were released in.

In Figure 17, it is clear that movies with the highest average opening weekend revenue were released in the Spring. This could suggest that movies that are released in the Spring are more successful in terms of generating more earnings during their first weekend in theaters than movies released in other seasons.

Figure 18: The average total revenue generated by films from all sources globally by the season the film was released in.

Figure 18 shows that movies released during the Summer months have the highest average worldwide gross. This could be due to the fact that in many countries around the world, kids are on summer vacation during these months, and thus families are more likely to go to the movies and contribute to increased ticket sales.

Figure 19: A comparison of the average production budget of movies based on what season they were released in.

Figure 19 shows that movies released in the Spring have the highest average budgets. This aligns with previous findings in the EDA. Figure 6 showed a positive association between a Hollywood movie’s budget and its opening weekend earnings, and Figure 17 showed that Spring releases have the highest average opening weekend revenue. It is therefore unsurprising that Spring releases also have the highest average budgets, which is what we see in this plot. This is consistent with the positive correlation between a movie’s budget and its opening weekend revenue.

Figure 20: The distribution of Oscar wins based on what season the movie was released in.

In Figure 20, movies released in the Fall won considerably more Oscars than movies released in other seasons. A likely explanation is that the Fall is close to when Oscar voting starts, so these films are more salient and relevant among voters, while there is still enough time before voting begins for the films to gain popularity and traction. From this, we can conclude that when a film’s success is defined solely by its number of Oscar wins, a Fall release is associated with a much greater chance of success.
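A seasonal tally of Oscar wins could be computed as in the sketch below. It assumes a numeric count of wins per movie, which the data set may not provide, and the column names and values are hypothetical. Because seasons can differ in how many movies are released, the sketch also computes wins per movie alongside the raw totals.

```r
# Toy data; column names and values are hypothetical.
movies <- data.frame(
  season     = c("Fall", "Fall", "Winter", "Spring", "Summer"),
  oscar_wins = c(3, 1, 0, 1, 0)
)

# Total Oscar wins per season (groups are ordered alphabetically).
wins_by_season <- tapply(movies$oscar_wins, movies$season, sum)

# Average wins per released movie, which adjusts for release volume.
wins_per_movie <- tapply(movies$oscar_wins, movies$season, mean)

print(wins_by_season)
print(wins_per_movie)
```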

Conclusion

text

References

text

Appendix: technical info

text

Appendix: extra explorations